Some Assembly Required
A few weeks ago, I began work on the software which will drive the sorting machine. Previously, we had experimented with neural networks (more on that in a future post) and written the software for the conveyor belt controller (The Belt Buckle). The software we're discussing today is the set of modules needed to capture a picture of each part, identify it, and decide which bin it belongs in. We decided to write them in Python. Jamie and I both have a bit of experience with Python, and it plays nicely with many AI libraries.
I had a general idea of what the software needed to do, as you can tell by reading our intro article (linked above). However, it became apparent very quickly that we needed a dedicated server to manage the interactions between modules. I'll channel my inner Jordan Peterson and make up a fitting quote to describe the scenario:
"The world isn't perfect and things frequently go wrong." - Jordan Peterson might have said something like this
You see Ivan, we can have our modules pass data to each other in a nice sequence, but what happens if one module hangs? Or if several parts come by really quickly and backlog the slowest module? Or what about other strange and mystical malfunctions that I cannot predict?
Building a Server
So, I built a server. But what is a server, really? For my readers who aren't techno-mages, a server is the logical control center of a system. It tracks all of the parts as they come and go, and coordinates the work that needs to be done by each module.
So I made a new server diagram to illustrate:
The server uses an SQL database (sqlite3, specifically) to track all of the current parts on the belt. Parts are added to the database when they are scanned by the taxidermist, and removed once they are sorted. Below is a sample of what the active part database looks like. Notice the status columns on the right; we'll look at them in a second:
part number | part image | category number | part color | category name | bin assignment | server status | bb status |
224314560737 | [image] | 37 | 25 | plate | 8 | wait_sort | ack_bin |
224314548321 | [image] | 25 | 4 | hinge | 11 | sort_done | sort_done |
224314541237 | [image] | 5 | 4 | brick | | wait_cf | ack_add |
224314539217 | [image] | 11 | 13 | wheel | | mtm_done | ack_add |
224314533569 | [image] | | | | | wait_mtm | wait_ack |
The server's main loop checks for returns from each client, and then checks each line in the above database, referring to the 'server_status' and 'bb_status' columns to decide if it needs to act on any particular part. Where it says [image], the database will actually contain the image captured by the taxidermist, for use by the MTMind to identify the parts. You may also notice that certain fields are empty on the bottom three rows. When the part is first added, it looks like the bottom row, with no category, part, or color numbers and no bin assignment. As the modules each perform their task, these values are updated by the server.
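For the code-curious, here is a minimal sketch of what such a table could look like in sqlite3. Only "server_status" and "bb_status" are names used in this post; the table name, the other column names, the column types, and the helper functions are assumptions made up for illustration, not the actual schema.

```python
import sqlite3

# Hypothetical schema: only server_status and bb_status are named in the post.
SCHEMA = """
CREATE TABLE IF NOT EXISTS active_parts (
    part_number     INTEGER PRIMARY KEY,
    part_image      BLOB,
    category_number INTEGER,
    part_color      INTEGER,
    category_name   TEXT,
    bin_assignment  INTEGER,
    server_status   TEXT NOT NULL,
    bb_status       TEXT NOT NULL
)
"""


def open_part_db(path=":memory:"):
    """Open (or create) the active part database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_part(conn, part_number, image_bytes):
    """Insert a freshly scanned part: no category, color, or bin yet."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO active_parts "
            "(part_number, part_image, server_status, bb_status) "
            "VALUES (?, ?, 'wait_mtm', 'wait_ack')",
            (part_number, image_bytes),
        )
```

A new row created this way looks like the bottom row of the sample: the unfilled columns are NULL until the server updates them.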
Here's a rough outline of the server's loop, so you can see how this all happens:
1. Check the taxidermist for a new part. If there is one, add it to the parts database and notify the belt buckle. Set the part's "bb_status" to "wait_ack" and its "server_status" to "wait_mtm".
2. Check the MTMind to see if it has identified the last part we sent it. If yes, change the part's "server_status" field in the active part DB from "wait_mtm" to "mtm_done".
3. Check the classifist to see if it has assigned a bin to the last part we sent it. If yes, change the "server_status" from "wait_cf" to "cf_done".
4. Check the belt buckle for any replies, such as "part_sorted" or "bin_assigned". Update the part DB accordingly.
5. Now the server loops through the entire active part DB. For each row, check the server and belt buckle status columns. If the status is "wait_mtm", "wait_cf", or "wait_sort", we do nothing. We're waiting for one of the modules to do its job, and we'll get a return from it the next time the server does steps 1-4.
For "_done" statuses, we do the following:
"mtm_done" - We have received the part number and category number back from the MTMind. Add them to the active part DB. It's time to send the part to the classifist for bin assignment.
"cf_done" - The classifist has assigned a bin number. Update the active part DB, and send the bin assignment to the Belt Buckle.
"sort_done" - The part was sorted into a bin by the BB. Remove this part from the DB.
6. The server has finished the loop. Go back to step 1 and do it all again. And again. And again. This loop runs 60 times per second, for all eternity. Or until the server crashes.
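The pass over the active part DB can be sketched roughly like this. Treat it as an illustration under assumptions rather than the real server code: the table layout, the function names, and the two "send" callbacks are hypothetical, and the follow-up statuses ("wait_cf" after sending to the classifist, "wait_sort" after sending to the belt buckle) are inferred from the outline and the sample table.

```python
import sqlite3

# Statuses where a module is still working and the server just waits.
WAITING = {"wait_mtm", "wait_cf", "wait_sort"}


def make_demo_db():
    """Build a cut-down, hypothetical version of the active part table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE active_parts ("
        "part_number INTEGER PRIMARY KEY, bin_assignment INTEGER, "
        "server_status TEXT, bb_status TEXT)"
    )
    return conn


def sweep_active_parts(conn, send_to_classifist, send_to_belt_buckle):
    """Do one pass over every row and act on the '_done' statuses."""
    rows = conn.execute(
        "SELECT part_number, bin_assignment, server_status FROM active_parts"
    ).fetchall()
    with conn:  # one transaction for the whole pass
        for part_number, bin_assignment, status in rows:
            if status in WAITING:
                continue  # nothing to do until a module reports back
            if status == "mtm_done":
                send_to_classifist(part_number)
                new_status = "wait_cf"  # assumed follow-up status
            elif status == "cf_done":
                send_to_belt_buckle(part_number, bin_assignment)
                new_status = "wait_sort"  # assumed follow-up status
            elif status == "sort_done":
                conn.execute(
                    "DELETE FROM active_parts WHERE part_number = ?",
                    (part_number,),
                )
                continue
            else:
                continue  # unknown status: leave the row alone
            conn.execute(
                "UPDATE active_parts SET server_status = ? WHERE part_number = ?",
                (new_status, part_number),
            )
```

Keeping the whole decision in one status column means the server never has to remember anything between passes; everything it needs to know about a part is in that part's row.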
Fake it Till You Make it
If you're thinking that's a convoluted way of doing things, by harry you might be right. But this is programming, not a karaoke night.
Seriously now, it seems to be the best way to handle the asynchronous nature of the problem. I don't know how long it will take for the MTMind to make a decision. I can assume the classifist will be quick, and the belt buckle as well. The MTMind is a different story; some of our experiments with neural networks ranged into the 40-50 millisecond neighborhood for identifying a single image. Our neural network will likely get slower for reasons we'll discuss in another future post (hint: we'll be using multiple networks and branching our way through them), so there's no telling how long that step will take.
Therefore, I'm building the server in this manner to handle slowdowns and even crashes. Maybe.
I'm running each module as its own process. They all loop independently of each other with their own access to CPU time and memory space. If a module crashes or becomes unresponsive, the classifist for example, the server could attempt to kill the process and start a new one. Then the server would check the active part DB for any parts whose status was 'wait_cf' and change their status back to 'mtm_done'. Then, the next time the server loops through the active part DB, it would re-send the parts to the new classifist process.
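A rough sketch of that recovery idea, assuming the modules are started with Python's multiprocessing module and that the active part table is called "active_parts" (both are assumptions; the post doesn't say how the processes are launched). The function names are hypothetical too.

```python
import multiprocessing as mp
import sqlite3


def restart_module(old_proc, target, timeout=2.0):
    """Kill an unresponsive (already started) module process and launch a new one.

    `target` is the module's main loop function, a hypothetical callable here.
    """
    if old_proc.is_alive():
        old_proc.terminate()
    old_proc.join(timeout)  # give the old process a moment to exit
    new_proc = mp.Process(target=target, daemon=True)
    new_proc.start()
    return new_proc


def requeue_parts(conn, stuck_status="wait_cf", retry_status="mtm_done"):
    """Roll stuck parts back one step so the next server pass re-sends them."""
    with conn:
        cur = conn.execute(
            "UPDATE active_parts SET server_status = ? WHERE server_status = ?",
            (retry_status, stuck_status),
        )
    return cur.rowcount  # number of parts that were rolled back
```

After a classifist restart, calling requeue_parts(conn) would flip every 'wait_cf' row back to 'mtm_done', and the normal server loop takes it from there. How the server decides a module is unresponsive in the first place is left out of this sketch.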
The Fool and the Fool Who Follows Him
Other methods may exist. One option that I considered was having the taxidermist be the boss. For each part that it scanned, it would create a new process which contains all of the functions of the MTMind and classifist, and then send serial data to the belt buckle. However, I quickly trashed that idea because only one process can read from or write to a serial port at a time, and the belt buckle needs to be able to send acknowledgements back to the server as well.
Also, there would probably be significant time wasted spooling up new instances of large Python scripts. Also Also, loading neural network models into memory for only one prediction seemed to be a waste of disk time and memory. Maybe I'm wrong, maybe there exist some really good caching techniques, but I have never heard of anyone programming software in such a fashion.
Perhaps some other architecture, or a better version of the current one, exists out there somewhere. If you know of one, or if you're convinced I'm a poser and a fool, please send us an email, mt_pages@outlook.com. Remember that I'm making this all up as I go. I hope some of it is good.
Also, we haven't yet touched on the SUIP. The "Sorting User Interface Program" will be a shiny graphical front end for the sorting machine. Currently it runs in a Python console and dumps its output in cute little log files. I haven't really considered which GUI toolkit to use, and there are many. I also have done zero research on how GUIs work. Therefore I predict that implementing the SUIP will involve alcohol, re-writing large chunks of the server, and some amount of profanity.
That's all for now folks. We'll be back with another post in the near future detailing all of the worst ways to build the server that we've discovered.